Improving the Lwazi ASR Baseline
نویسندگان
چکیده
We investigate the impact of recent advances in speech recognition techniques for under-resourced languages. Specifically, we review earlier results published on the Lwazi ASR corpus of South African languages, and experiment with additional acoustic modeling approaches. We demonstrate large gains by applying current state-of-the-art techniques, even if the data itself is neither extended nor improved. We analyze the various performance improvements observed, report on comparative performance per technique – across all eleven languages in the corpus – and discuss the implications of our findings for under-resourced languages in general.
منابع مشابه
Collecting and Evaluating Speech Recognition Corpora for Nine Southern Bantu Languages
We describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively small compared to major corpora in world languages, and we report on our investigation of the stability of the ASR models derived from the corpus. We also repor...
متن کاملEACL 2009 Proceedings of the EACL 2009 Workshop on Language Technologies for African Languages
We describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively small compared to major corpora in world languages, and we report on our investigation of the stability of the ASR models derived from the corpus. We also repor...
متن کاملASR corpus design for resource-scarce languages
We investigate the number of speakers and the amount of data that is required for the development of useable speakerindependent speech-recognition systems in resource-scarce languages. Our experiments employ the Lwazi corpus, which contains speech in the eleven official languages of South Africa. We find that a surprisingly small number of speakers (fewer than 50) and around 10 to 20 hours of s...
متن کاملImproving the readability of class lecture ASR results using a confusion network
This paper presents a method for improving the readability of Automatic Speech Recognition (ASR) results for classroom lectures. Most of the previous research on improving the readability of recognition results focused mainly on manually transcribed texts, and not ASR results. Due to the presence of a large number of domain-dependent words and the casual presentation style, even state-of-the-ar...
متن کاملModel-based independent component analysis for robust multi-microphone automatic speech recognition
In this communication, we present a method for noise-robust multimicrophone automatic speech recognition (ASR). It is assumed that the speech source to be recognized is recorded with several microphones in a noisy acoustic environment. The proposed method estimates the short-term subband energies (as they are needed for computing the ASR front-end) of the clean speech source from the ones of th...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016